Sports Analytics

By: Patrick Bulger, Tim Hulak, Jordan Hyatt, Cal Wardell

Introduction

Strategy matters more in American football than the casual viewer may realize. The sport is often called a "game of inches," a phrase taken from the quote "Football is a game of inches and inches make the champion," credited to Hall of Fame coach Vince Lombardi.

In [1]:
# Import dependencies
import datetime as dt
import os
import random
from os import path

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import requests
import seaborn as sns
from bs4 import BeautifulSoup
from PIL import Image
from wordcloud import WordCloud, STOPWORDS, ImageColorGenerator

# Show every column and row when displaying DataFrames
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
In [2]:
# Read in data
NFL = pd.read_csv("NFL_Play_By_Play_2009-2018.csv", low_memory=False)

Web Scrape Wikipedia for Season Duration

Define Analytical Functions

In [3]:
def team_division(team):
  """Pass in team abbreviation and determine the division the team is in"""
  # SD, STL, and JAC are legacy abbreviations that appear in the earlier seasons of the data
  if team in ["SF","SEA","ARI","LA","STL"]:
    return "NFC West"
  elif team in ["NO","TB","CAR","ATL"]:
    return "NFC South"
  elif team in ["WAS","NYG","DAL","PHI"]:
    return "NFC East"
  elif team in ["GB","CHI","MIN","DET"]:
    return "NFC North"
  elif team in ["KC","OAK","LAC","SD","DEN"]:
    return "AFC West"
  elif team in ["TEN","IND","HOU","JAX","JAC"]:
    return "AFC South"
  elif team in ["BUF","MIA","NE","NYJ"]:
    return "AFC East"
  elif team in ["PIT","BAL","CLE","CIN"]:
    return "AFC North"
  else:
    return "Error: Team Not Found"

def team_confrence(team):
  """Pass in team abbreviation and determine the conference the team is in"""
  # SD, STL, and JAC are legacy abbreviations that appear in the earlier seasons of the data
  if team in ["SF","SEA","ARI","LA","STL","NO","TB","CAR","ATL","WAS","NYG","DAL","PHI","GB","CHI","MIN","DET"]:
    return "NFC"
  elif team in ["KC","OAK","LAC","SD","DEN","TEN","IND","HOU","JAX","JAC","BUF","MIA","NE","NYJ","PIT","BAL","CLE","CIN"]:
    return "AFC"
  else:
    return "Error: Team Not Found"

def penalty_type(pen_team,pos_team):
    """Determine if a penalty is offensive or defensive"""
    if pen_team == pos_team:
        return "Offensive"
    else:
        return "Defensive"
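
A dictionary lookup is one possible alternative to the if/elif chains above. The sketch below is illustrative only: the names `DIVISIONS`, `TEAM_TO_DIVISION`, `lookup_division`, and `lookup_conference` are hypothetical and not part of the original notebook. It assumes the legacy abbreviations SD, STL, and JAC (which appear in the `home_team` column) belong with their current franchises.

```python
# Hypothetical lookup table; legacy codes (SD, STL, JAC) are grouped with
# the franchise's current abbreviation (an assumption of this sketch).
DIVISIONS = {
    "NFC West": ["SF", "SEA", "ARI", "LA", "STL"],
    "NFC South": ["NO", "TB", "CAR", "ATL"],
    "NFC East": ["WAS", "NYG", "DAL", "PHI"],
    "NFC North": ["GB", "CHI", "MIN", "DET"],
    "AFC West": ["KC", "OAK", "LAC", "SD", "DEN"],
    "AFC South": ["TEN", "IND", "HOU", "JAX", "JAC"],
    "AFC East": ["BUF", "MIA", "NE", "NYJ"],
    "AFC North": ["PIT", "BAL", "CLE", "CIN"],
}

# Invert to team -> division so each lookup is a single dictionary access.
TEAM_TO_DIVISION = {
    team: division for division, teams in DIVISIONS.items() for team in teams
}

def lookup_division(team):
    """Return the division for a team abbreviation, or None if unknown."""
    return TEAM_TO_DIVISION.get(team)

def lookup_conference(team):
    """Return 'NFC' or 'AFC' (the first word of the division), or None."""
    division = lookup_division(team)
    return division.split()[0] if division else None
```

Returning `None` for an unknown abbreviation, rather than an error string, would let missing mappings show up as nulls if the lookup were applied to a DataFrame column.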

About the Data

The dataset contains 449,371 observations across 255 variables. Each record represents a single play in an NFL game, and each variable describes an attribute of that play, such as the teams involved, the down, the yards to go, or the play type. The data cover plays from the 2009 through 2018 seasons.
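
One way to check the season range is to derive a season from each play's `game_date`. The sketch below uses a small toy DataFrame in place of the full file. The `season` column and the rule that January and February games belong to the season that began the previous year are assumptions made for illustration, not something the dataset itself defines.

```python
import pandas as pd

# Toy rows standing in for the play-by-play file; only game_date is needed.
plays = pd.DataFrame({"game_date": ["2009-09-10", "2010-01-03", "2018-12-30"]})

dates = pd.to_datetime(plays["game_date"])

# Assumption: games played in January or February count toward the season
# that started in the previous calendar year.
plays["season"] = dates.dt.year - (dates.dt.month <= 2).astype(int)

# First and last season present in the data
season_range = (int(plays["season"].min()), int(plays["season"].max()))
```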



Data Exploration

In [4]:
NFL.head(10)
Out[4]:
play_id game_id home_team away_team posteam posteam_type defteam side_of_field yardline_100 game_date quarter_seconds_remaining half_seconds_remaining game_seconds_remaining game_half quarter_end drive sp qtr down goal_to_go time yrdln ydstogo ydsnet desc play_type yards_gained shotgun no_huddle qb_dropback qb_kneel qb_spike qb_scramble pass_length pass_location air_yards yards_after_catch run_location run_gap field_goal_result kick_distance extra_point_result two_point_conv_result home_timeouts_remaining away_timeouts_remaining timeout timeout_team td_team posteam_timeouts_remaining defteam_timeouts_remaining total_home_score total_away_score posteam_score defteam_score score_differential posteam_score_post defteam_score_post score_differential_post no_score_prob opp_fg_prob opp_safety_prob opp_td_prob fg_prob safety_prob td_prob extra_point_prob two_point_conversion_prob ep epa total_home_epa total_away_epa total_home_rush_epa total_away_rush_epa total_home_pass_epa total_away_pass_epa air_epa yac_epa comp_air_epa comp_yac_epa total_home_comp_air_epa total_away_comp_air_epa total_home_comp_yac_epa total_away_comp_yac_epa total_home_raw_air_epa total_away_raw_air_epa total_home_raw_yac_epa total_away_raw_yac_epa wp def_wp home_wp away_wp wpa home_wp_post away_wp_post total_home_rush_wpa total_away_rush_wpa total_home_pass_wpa total_away_pass_wpa air_wpa yac_wpa comp_air_wpa comp_yac_wpa total_home_comp_air_wpa total_away_comp_air_wpa total_home_comp_yac_wpa total_away_comp_yac_wpa total_home_raw_air_wpa total_away_raw_air_wpa total_home_raw_yac_wpa total_away_raw_yac_wpa punt_blocked first_down_rush first_down_pass first_down_penalty third_down_converted third_down_failed fourth_down_converted fourth_down_failed incomplete_pass interception punt_inside_twenty punt_in_endzone punt_out_of_bounds punt_downed punt_fair_catch kickoff_inside_twenty kickoff_in_endzone kickoff_out_of_bounds kickoff_downed kickoff_fair_catch fumble_forced fumble_not_forced 
fumble_out_of_bounds solo_tackle safety penalty tackled_for_loss fumble_lost own_kickoff_recovery own_kickoff_recovery_td qb_hit rush_attempt pass_attempt sack touchdown pass_touchdown rush_touchdown return_touchdown extra_point_attempt two_point_attempt field_goal_attempt kickoff_attempt punt_attempt fumble complete_pass assist_tackle lateral_reception lateral_rush lateral_return lateral_recovery passer_player_id passer_player_name receiver_player_id receiver_player_name rusher_player_id rusher_player_name lateral_receiver_player_id lateral_receiver_player_name lateral_rusher_player_id lateral_rusher_player_name lateral_sack_player_id lateral_sack_player_name interception_player_id interception_player_name lateral_interception_player_id lateral_interception_player_name punt_returner_player_id punt_returner_player_name lateral_punt_returner_player_id lateral_punt_returner_player_name kickoff_returner_player_name kickoff_returner_player_id lateral_kickoff_returner_player_id lateral_kickoff_returner_player_name punter_player_id punter_player_name kicker_player_name kicker_player_id own_kickoff_recovery_player_id own_kickoff_recovery_player_name blocked_player_id blocked_player_name tackle_for_loss_1_player_id tackle_for_loss_1_player_name tackle_for_loss_2_player_id tackle_for_loss_2_player_name qb_hit_1_player_id qb_hit_1_player_name qb_hit_2_player_id qb_hit_2_player_name forced_fumble_player_1_team forced_fumble_player_1_player_id forced_fumble_player_1_player_name forced_fumble_player_2_team forced_fumble_player_2_player_id forced_fumble_player_2_player_name solo_tackle_1_team solo_tackle_2_team solo_tackle_1_player_id solo_tackle_2_player_id solo_tackle_1_player_name solo_tackle_2_player_name assist_tackle_1_player_id assist_tackle_1_player_name assist_tackle_1_team assist_tackle_2_player_id assist_tackle_2_player_name assist_tackle_2_team assist_tackle_3_player_id assist_tackle_3_player_name assist_tackle_3_team assist_tackle_4_player_id 
assist_tackle_4_player_name assist_tackle_4_team pass_defense_1_player_id pass_defense_1_player_name pass_defense_2_player_id pass_defense_2_player_name fumbled_1_team fumbled_1_player_id fumbled_1_player_name fumbled_2_player_id fumbled_2_player_name fumbled_2_team fumble_recovery_1_team fumble_recovery_1_yards fumble_recovery_1_player_id fumble_recovery_1_player_name fumble_recovery_2_team fumble_recovery_2_yards fumble_recovery_2_player_id fumble_recovery_2_player_name return_team return_yards penalty_team penalty_player_id penalty_player_name penalty_yards replay_or_challenge replay_or_challenge_result penalty_type defensive_two_point_attempt defensive_two_point_conv defensive_extra_point_attempt defensive_extra_point_conv
0 46 2009091000 PIT TEN PIT home TEN TEN 30.0 2009-09-10 900.0 1800.0 3600.0 Half1 0 1 0 1 NaN 0.0 15:00 TEN 30 0 0 R.Bironas kicks 67 yards from TEN 30 to PIT 3.... kickoff 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 67.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 NaN NaN NaN 0.0 0.0 0.0 0.001506 0.179749 0.006639 0.281138 0.213700 0.003592 0.313676 0.0 0.0 0.323526 2.014474 2.014474 -2.014474 0.000000 0.000000 0.000000 0.000000 NaN NaN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN NaN NaN 0.000000 0.000000 0.000000 0.000000 NaN NaN 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN S.Logan 00-0026491 NaN NaN NaN NaN R.Bironas 00-0020962 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN 00-0025406 NaN M.Griffin NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN PIT 39.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
1 68 2009091000 PIT TEN PIT home TEN PIT 58.0 2009-09-10 893.0 1793.0 3593.0 Half1 0 1 0 1 1.0 0.0 14:53 PIT 42 10 5 (14:53) B.Roethlisberger pass short left to H.... pass 5.0 0 0 1.0 0 0 0 short left -3.0 8.0 NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.000969 0.108505 0.001061 0.169117 0.293700 0.003638 0.423011 0.0 0.0 2.338000 0.077907 2.092381 -2.092381 0.000000 0.000000 0.077907 -0.077907 -0.938735 1.016643 -0.938735 1.016643 -0.938735 0.938735 1.016643 -1.016643 -0.938735 0.938735 1.016643 -1.016643 0.546433 0.453567 0.546433 0.453567 0.004655 0.551088 0.448912 0.000000 0.000000 0.004655 -0.004655 -0.028383 0.033038 -0.028383 0.033038 -0.028383 0.028383 0.033038 -0.033038 -0.028383 0.028383 0.033038 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 00-0022924 B.Roethlisberger 00-0017162 H.Ward NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN 00-0021219 NaN C.Hope NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
2 92 2009091000 PIT TEN PIT home TEN PIT 53.0 2009-09-10 856.0 1756.0 3556.0 Half1 0 1 0 1 2.0 0.0 14:16 PIT 47 5 2 (14:16) W.Parker right end to PIT 44 for -3 ya... run -3.0 0 0 0.0 0 0 0 NaN NaN NaN NaN right end NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.001057 0.105106 0.000981 0.162747 0.304805 0.003826 0.421478 0.0 0.0 2.415907 -1.402760 0.689621 -0.689621 -1.402760 1.402760 0.077907 -0.077907 NaN NaN 0.000000 0.000000 -0.938735 0.938735 1.016643 -1.016643 -0.938735 0.938735 1.016643 -1.016643 0.551088 0.448912 0.551088 0.448912 -0.040295 0.510793 0.489207 -0.040295 0.040295 0.004655 -0.004655 NaN NaN 0.000000 0.000000 -0.028383 0.028383 0.033038 -0.033038 -0.028383 0.028383 0.033038 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 00-0022250 W.Parker NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 00-0024331 S.Tulloch NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN 00-0024331 NaN S.Tulloch NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
3 113 2009091000 PIT TEN PIT home TEN PIT 56.0 2009-09-10 815.0 1715.0 3515.0 Half1 0 1 0 1 3.0 0.0 13:35 PIT 44 8 2 (13:35) (Shotgun) B.Roethlisberger pass incomp... pass 0.0 1 0 1.0 0 0 0 deep right 34.0 NaN NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.001434 0.149088 0.001944 0.234801 0.289336 0.004776 0.318621 0.0 0.0 1.013147 -1.712583 -1.022962 1.022962 -1.402760 1.402760 -1.634676 1.634676 3.412572 -5.125156 0.000000 0.000000 -0.938735 0.938735 1.016643 -1.016643 2.473837 -2.473837 -4.108513 4.108513 0.510793 0.489207 0.510793 0.489207 -0.049576 0.461217 0.538783 -0.040295 0.040295 -0.044921 0.044921 0.109925 -0.159501 0.000000 0.000000 -0.028383 0.028383 0.033038 -0.033038 0.081542 -0.081542 -0.126463 0.126463 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 00-0022924 B.Roethlisberger 00-0026901 M.Wallace NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
4 139 2009091000 PIT TEN PIT home TEN PIT 56.0 2009-09-10 807.0 1707.0 3507.0 Half1 0 1 0 1 4.0 0.0 13:27 PIT 44 8 2 (13:27) (Punt formation) D.Sepulveda punts 54 ... punt 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 54.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.001861 0.213480 0.003279 0.322262 0.244603 0.006404 0.208111 0.0 0.0 -0.699436 2.097796 1.074834 -1.074834 -1.402760 1.402760 -1.634676 1.634676 NaN NaN 0.000000 0.000000 -0.938735 0.938735 1.016643 -1.016643 2.473837 -2.473837 -4.108513 4.108513 0.461217 0.538783 0.461217 0.538783 0.097712 0.558929 0.441071 -0.040295 0.040295 -0.044921 0.044921 NaN NaN 0.000000 0.000000 -0.028383 0.028383 0.033038 -0.033038 0.081542 -0.081542 -0.126463 0.126463 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 00-0025499 D.Sepulveda NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
5 162 2009091000 PIT TEN TEN away PIT TEN 98.0 2009-09-10 796.0 1696.0 3496.0 Half1 0 2 0 1 1.0 0.0 13:16 TEN 2 10 0 (13:16) C.Johnson up the middle to TEN 2 for n... run 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN middle NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.002944 0.236918 0.025923 0.370074 0.144685 0.003099 0.216357 0.0 0.0 -1.398360 -0.696302 1.771136 -1.771136 -0.706458 0.706458 -1.634676 1.634676 NaN NaN 0.000000 0.000000 -0.938735 0.938735 1.016643 -1.016643 2.473837 -2.473837 -4.108513 4.108513 0.441071 0.558929 0.558929 0.441071 -0.019524 0.578453 0.421547 -0.020771 0.020771 -0.044921 0.044921 NaN NaN 0.000000 0.000000 -0.028383 0.028383 0.033038 -0.033038 0.081542 -0.081542 -0.126463 0.126463 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 00-0026164 C.Johnson NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 00-0021344 B.Keisel PIT 00-0005082 J.Farrior PIT NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
6 183 2009091000 PIT TEN TEN away PIT TEN 98.0 2009-09-10 760.0 1660.0 3460.0 Half1 0 2 0 1 2.0 0.0 12:40 TEN 2 10 4 (12:40) K.Collins pass short left to A.Hall to... pass 4.0 0 0 1.0 0 0 0 short left 3.0 1.0 NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.003283 0.264022 0.024132 0.409397 0.116551 0.003305 0.179311 0.0 0.0 -2.094662 -0.179149 1.950285 -1.950285 -0.706458 0.706458 -1.455527 1.455527 -0.521641 0.342492 -0.521641 0.342492 -0.417094 0.417094 0.674150 -0.674150 2.995478 -2.995478 -4.451005 4.451005 0.421547 0.578453 0.578453 0.421547 -0.004427 0.582881 0.417119 -0.020771 0.020771 -0.040494 0.040494 -0.016088 0.011661 -0.016088 0.011661 -0.012296 0.012296 0.021378 -0.021378 0.097630 -0.097630 -0.138124 0.138124 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 00-0003292 K.Collins 00-0024489 A.Hall NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN PIT NaN 00-0022119 NaN T.Polamalu NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
7 207 2009091000 PIT TEN TEN away PIT TEN 94.0 2009-09-10 731.0 1631.0 3431.0 Half1 0 2 0 1 3.0 0.0 12:11 TEN 6 6 2 (12:11) (Shotgun) C.Johnson left end to TEN 4 ... run -2.0 1 0 0.0 0 0 0 NaN NaN NaN NaN left end NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.003244 0.282791 0.018507 0.420930 0.086777 0.003299 0.184451 0.0 0.0 -2.273811 -1.119477 3.069762 -3.069762 0.413020 -0.413020 -1.455527 1.455527 NaN NaN 0.000000 0.000000 -0.417094 0.417094 0.674150 -0.674150 2.995478 -2.995478 -4.451005 4.451005 0.417119 0.582881 0.582881 0.417119 -0.034663 0.617544 0.382456 0.013892 -0.013892 -0.040494 0.040494 NaN NaN 0.000000 0.000000 -0.012296 0.012296 0.021378 -0.021378 0.097630 -0.097630 -0.138124 0.138124 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN 00-0026164 C.Johnson NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 00-0022119 T.Polamalu NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN PIT NaN 00-0022119 NaN T.Polamalu NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
8 228 2009091000 PIT TEN TEN away PIT TEN 96.0 2009-09-10 694.0 1594.0 3394.0 Half1 0 2 0 1 4.0 0.0 11:34 TEN 4 8 2 (11:34) (Punt formation) C.Hentrich punts 50 y... punt 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 50.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.003023 0.337300 0.013430 0.483402 0.023641 0.003213 0.135991 0.0 0.0 -3.393288 -0.021313 3.091075 -3.091075 0.413020 -0.413020 -1.455527 1.455527 NaN NaN 0.000000 0.000000 -0.417094 0.417094 0.674150 -0.674150 2.995478 -2.995478 -4.451005 4.451005 0.382456 0.617544 0.617544 0.382456 0.026054 0.591489 0.408511 0.013892 -0.013892 -0.040494 0.040494 NaN NaN 0.000000 0.000000 -0.012296 0.012296 0.021378 -0.021378 0.097630 -0.097630 -0.138124 0.138124 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 00-0026491 S.Logan NaN NaN NaN NaN NaN NaN 00-0007308 C.Hentrich NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN 00-0025406 NaN M.Griffin NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN PIT 11.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
9 253 2009091000 PIT TEN PIT home TEN TEN 43.0 2009-09-10 684.0 1584.0 3384.0 Half1 0 3 0 1 1.0 0.0 11:24 TEN 43 10 3 (11:24) B.Roethlisberger pass short right to M... pass 3.0 0 0 1.0 0 0 0 short right -2.0 5.0 NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 0.001494 0.069965 0.000275 0.107951 0.336207 0.003334 0.480774 0.0 0.0 3.414601 -0.215293 2.875783 -2.875783 0.413020 -0.413020 -1.670819 1.670819 -0.680790 0.465497 -0.680790 0.465497 -1.097884 1.097884 1.139648 -1.139648 2.314689 -2.314689 -3.985508 3.985508 0.591489 0.408511 0.591489 0.408511 -0.006084 0.585405 0.414595 0.013892 -0.013892 -0.046578 0.046578 -0.022089 0.016005 -0.022089 0.016005 -0.034385 0.034385 0.037383 -0.037383 0.075540 -0.075540 -0.122118 0.122118 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 00-0022924 B.Roethlisberger 00-0026901 M.Wallace NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN 00-0026243 NaN W.Hayes NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0
In [5]:
NFL.shape
Out[5]:
(449371, 255)
In [6]:
NFL.columns
Out[6]:
Index(['play_id', 'game_id', 'home_team', 'away_team', 'posteam',
       'posteam_type', 'defteam', 'side_of_field', 'yardline_100', 'game_date',
       ...
       'penalty_player_id', 'penalty_player_name', 'penalty_yards',
       'replay_or_challenge', 'replay_or_challenge_result', 'penalty_type',
       'defensive_two_point_attempt', 'defensive_two_point_conv',
       'defensive_extra_point_attempt', 'defensive_extra_point_conv'],
      dtype='object', length=255)
In [7]:
NFL['home_team'].value_counts()
Out[7]:
PHI    14510
IND    14322
OAK    14320
CIN    14300
DEN    14290
NE     14270
BUF    14226
GB     14176
CLE    14166
BAL    14158
NYJ    14136
CHI    14112
ARI    14091
KC     14079
DET    14009
SF     14009
NYG    14008
NO     14003
PIT    14001
MIA    13974
TB     13967
HOU    13959
TEN    13920
ATL    13871
MIN    13808
DAL    13788
WAS    13781
CAR    13770
SEA    13603
SD     11007
JAC    10035
STL     9842
LA      4247
JAX     4084
LAC     2529
Name: home_team, dtype: int64
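
The counts above list 35 abbreviations because three franchises appear under two codes each (SD and LAC, STL and LA, JAC and JAX). A minimal sketch of merging them before counting is shown below on a toy Series. The `RELOCATED` mapping is an assumption about which codes refer to the same franchise, and it is not part of the original notebook.

```python
import pandas as pd

# Assumed mapping from legacy abbreviations to the current ones
RELOCATED = {"SD": "LAC", "STL": "LA", "JAC": "JAX"}

# Toy values standing in for NFL['home_team']
home = pd.Series(["SD", "LAC", "STL", "LA", "JAC", "JAX", "PHI"], name="home_team")

# Replace legacy codes, then count plays per franchise
normalized = home.replace(RELOCATED)
counts = normalized.value_counts()
```

Applied to the full column, this would combine, for example, the 11,007 SD rows and 2,529 LAC rows into a single total of 13,536, which is in line with the other teams.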
In [8]:
NFL.describe()
Out[8]:
play_id game_id yardline_100 quarter_seconds_remaining half_seconds_remaining game_seconds_remaining quarter_end drive sp qtr down goal_to_go ydstogo ydsnet yards_gained shotgun no_huddle qb_dropback qb_kneel qb_spike qb_scramble air_yards yards_after_catch kick_distance home_timeouts_remaining away_timeouts_remaining timeout posteam_timeouts_remaining defteam_timeouts_remaining total_home_score total_away_score posteam_score defteam_score score_differential posteam_score_post defteam_score_post score_differential_post no_score_prob opp_fg_prob opp_safety_prob opp_td_prob fg_prob safety_prob td_prob extra_point_prob two_point_conversion_prob ep epa total_home_epa total_away_epa total_home_rush_epa total_away_rush_epa total_home_pass_epa total_away_pass_epa air_epa yac_epa comp_air_epa comp_yac_epa total_home_comp_air_epa total_away_comp_air_epa total_home_comp_yac_epa total_away_comp_yac_epa total_home_raw_air_epa total_away_raw_air_epa total_home_raw_yac_epa total_away_raw_yac_epa wp def_wp home_wp away_wp wpa home_wp_post away_wp_post total_home_rush_wpa total_away_rush_wpa total_home_pass_wpa total_away_pass_wpa air_wpa yac_wpa comp_air_wpa comp_yac_wpa total_home_comp_air_wpa total_away_comp_air_wpa total_home_comp_yac_wpa total_away_comp_yac_wpa total_home_raw_air_wpa total_away_raw_air_wpa total_home_raw_yac_wpa total_away_raw_yac_wpa punt_blocked first_down_rush first_down_pass first_down_penalty third_down_converted third_down_failed fourth_down_converted fourth_down_failed incomplete_pass interception punt_inside_twenty punt_in_endzone punt_out_of_bounds punt_downed punt_fair_catch kickoff_inside_twenty kickoff_in_endzone kickoff_out_of_bounds kickoff_downed kickoff_fair_catch fumble_forced fumble_not_forced fumble_out_of_bounds solo_tackle safety penalty tackled_for_loss fumble_lost own_kickoff_recovery own_kickoff_recovery_td qb_hit rush_attempt pass_attempt sack touchdown pass_touchdown rush_touchdown return_touchdown extra_point_attempt 
two_point_attempt field_goal_attempt kickoff_attempt punt_attempt fumble complete_pass assist_tackle lateral_reception lateral_rush lateral_return lateral_recovery lateral_sack_player_id lateral_sack_player_name assist_tackle_4_player_id assist_tackle_4_player_name assist_tackle_4_team fumble_recovery_1_yards fumble_recovery_2_yards return_yards penalty_yards replay_or_challenge defensive_two_point_attempt defensive_two_point_conv defensive_extra_point_attempt defensive_extra_point_conv
count 449371.000000 4.493710e+05 436301.000000 449230.000000 449206.000000 449208.000000 449371.000000 449371.000000 449371.000000 449371.000000 381409.000000 436664.000000 449371.000000 449371.000000 449158.000000 449371.000000 449371.000000 436497.000000 449371.000000 449371.000000 449371.000000 175719.000000 108907.000000 50740.000000 449371.000000 449371.000000 436497.000000 436492.000000 436492.000000 449371.000000 449371.000000 433952.000000 433952.000000 433952.000000 436492.000000 436492.000000 436492.000000 448693.000000 448693.000000 448693.000000 448693.000000 448693.000000 448693.000000 448693.000000 449371.000000 449371.000000 435812.000000 433535.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 174082.000000 173699.000000 436299.000000 436111.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 433313.000000 4.333130e+05 435388.000000 435388.000000 444050.000000 433334.000000 433334.000000 449371.000000 449371.000000 449371.000000 449371.000000 173940.000000 173724.000000 436156.000000 436064.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 449371.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 436497.000000 0.0 0.0 0.0 0.0 0.0 6026.000000 47.000000 
449350.000000 32618.000000 449371.000000 436497.000000 436497.000000 436497.0 436497.0
mean 2140.689606 2.013620e+09 49.800658 413.968820 810.775709 1700.697033 0.017111 12.275053 0.072090 2.577696 1.999903 0.050329 7.302314 27.369207 3.903698 0.401844 0.067319 0.442837 0.008554 0.001598 0.015539 8.313114 5.216919 40.594166 2.515939 2.492900 0.042740 2.510135 2.527022 12.032764 10.560561 10.560958 11.718642 -1.157683 10.745858 11.664159 -0.918301 0.127358 0.093919 0.002469 0.138923 0.243093 0.002620 0.295061 0.023525 0.000733 1.695686 0.008488 1.692482 -1.692482 0.348168 -0.348168 1.069312 -1.069312 0.509500 -0.352777 0.055941 0.167436 0.286872 -0.286872 0.448355 -0.448355 0.313304 -0.313304 0.590454 -0.590454 0.505785 4.942155e-01 0.539233 0.460688 0.001968 0.539592 0.460319 0.011422 -0.011422 0.026271 -0.026271 0.014674 -0.009852 0.001639 0.004537 0.007224 -0.007224 0.013766 -0.013766 0.005023 -0.005023 0.017402 -0.017402 0.000305 0.069391 0.139025 0.020857 0.059893 0.094743 0.005356 0.005588 0.142704 0.010607 0.019116 0.000032 0.005189 0.007285 0.013496 0.008126 0.014839 0.000545 0.000057 0.000092 0.010087 0.004997 0.001196 0.480413 0.000387 0.074768 0.029762 0.006990 0.000213 0.000002 0.055471 0.312598 0.431313 0.027001 0.029400 0.017721 0.009214 0.001679 0.025544 0.001595 0.022481 0.059038 0.055020 0.015017 0.249828 0.124672 0.000215 0.000039 0.000192 0.001727 NaN NaN NaN NaN NaN 2.302688 4.808511 1.042773 8.487829 0.001823 0.000069 0.000014 0.0 0.0
std 1240.303671 2.842246e+06 25.062131 279.137304 554.712205 1053.533368 0.129684 7.124626 0.258637 1.129958 1.005693 0.218624 4.877843 25.617677 7.884118 0.490271 0.250573 0.496722 0.092092 0.039940 0.123685 10.094080 7.230935 13.979314 0.784035 0.801472 0.202271 0.786341 0.768491 10.204185 9.540286 9.613147 10.010544 10.887777 9.687207 10.027421 10.937217 0.199420 0.071936 0.003426 0.110381 0.159276 0.001473 0.168736 0.149043 0.018620 1.751795 1.327887 12.171866 12.171866 5.461487 5.461487 10.801492 10.801492 1.455952 2.008401 0.654974 0.630293 6.450182 6.450182 6.722098 6.722098 9.954506 9.954506 12.814876 12.814876 0.290271 2.902708e-01 0.289035 0.289024 0.045293 0.291174 0.291162 0.156981 0.156981 0.275887 0.275887 0.058077 0.068841 0.023699 0.021430 0.196361 0.196361 0.205740 0.205740 0.320486 0.320486 0.380072 0.380072 0.017453 0.254118 0.345973 0.142906 0.237288 0.292860 0.072990 0.074542 0.349771 0.102444 0.136932 0.005663 0.071848 0.085042 0.115386 0.089778 0.120907 0.023344 0.007568 0.009572 0.099927 0.070510 0.034561 0.499617 0.019673 0.263017 0.169930 0.083312 0.014595 0.001514 0.228898 0.463552 0.495260 0.162087 0.168925 0.131934 0.095548 0.040945 0.157771 0.039900 0.148243 0.235696 0.228019 0.121622 0.432914 0.330347 0.014673 0.006241 0.013871 0.041526 NaN NaN NaN NaN NaN 9.066064 14.961746 5.462253 5.323953 0.042652 0.008290 0.003708 0.0 0.0
min 35.000000 2.009091e+09 1.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 1.000000 1.000000 0.000000 0.000000 -87.000000 -38.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 -70.000000 -81.000000 -3.000000 -4.000000 -1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 -59.000000 0.000000 0.000000 -59.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 -3.836488 -12.849594 -51.355447 -68.932906 -31.000804 -32.094625 -46.814602 -56.430545 -9.803719 -14.000000 -9.803719 -10.086693 -31.016466 -31.967276 -32.399307 -35.486008 -67.710484 -64.050944 -70.993222 -84.082169 0.000000 2.220446e-16 0.000000 0.000000 -0.997214 0.000000 0.000000 -1.597957 -1.728627 -1.522974 -1.808344 -0.999881 -0.986673 -0.997346 -0.975057 -1.878256 -2.087053 -2.360388 -2.389044 -4.300814 -3.206215 -4.244676 -4.085693 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN -34.000000 -16.000000 -18.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
25% 1074.000000 2.011111e+09 31.000000 152.000000 286.000000 784.000000 0.000000 6.000000 0.000000 2.000000 1.000000 0.000000 3.000000 5.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 2.000000 1.000000 32.000000 2.000000 2.000000 0.000000 2.000000 2.000000 3.000000 3.000000 3.000000 3.000000 -7.000000 3.000000 3.000000 -7.000000 0.002706 0.034016 0.000100 0.039040 0.151493 0.001846 0.189431 0.000000 0.000000 0.483207 -0.611316 -5.043612 -8.316395 -2.667086 -3.433534 -5.071480 -7.061097 -0.586390 -0.658163 0.000000 0.000000 -3.380103 -3.967464 -3.421569 -4.357901 -5.236962 -5.875767 -6.452613 -7.377035 0.276884 2.627513e-01 0.325887 0.216823 -0.014464 0.322792 0.213196 -0.078300 -0.103365 -0.154157 -0.215297 -0.014806 -0.014275 0.000000 0.000000 -0.099950 -0.117827 -0.099230 -0.131210 -0.155568 -0.171809 -0.191146 -0.225458 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN 0.000000 0.000000 0.000000 5.000000 0.000000 0.000000 0.000000 0.0 0.0
50% 2125.000000 2.013123e+09 52.000000 396.000000 797.000000 1800.000000 0.000000 12.000000 0.000000 3.000000 2.000000 0.000000 9.000000 21.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 6.000000 3.000000 41.000000 3.000000 3.000000 0.000000 3.000000 3.000000 10.000000 9.000000 9.000000 10.000000 0.000000 9.000000 10.000000 0.000000 0.024371 0.082200 0.000966 0.122752 0.231043 0.002980 0.313676 0.000000 0.000000 1.501881 -0.030269 1.289195 -1.289195 0.281556 -0.281556 0.774242 -0.774242 -0.010430 0.000000 0.000000 0.000000 0.125292 -0.125292 0.227606 -0.227606 0.165482 -0.165482 0.111352 -0.111352 0.510102 4.898976e-01 0.539024 0.460891 0.000000 0.541220 0.458702 0.009235 -0.009235 0.025942 -0.025942 -0.000016 0.000000 0.000000 0.000000 0.003990 -0.003990 0.009112 -0.009112 0.003979 -0.003979 0.009264 -0.009264 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN 0.000000 0.000000 0.000000 5.000000 0.000000 0.000000 0.000000 0.0 0.0
75% 3180.000000 2.016103e+09 71.000000 656.000000 1288.000000 2583.000000 0.000000 18.000000 0.000000 4.000000 3.000000 0.000000 10.000000 46.000000 6.000000 1.000000 0.000000 1.000000 0.000000 0.000000 0.000000 13.000000 7.000000 51.000000 3.000000 3.000000 0.000000 3.000000 3.000000 19.000000 17.000000 17.000000 17.000000 5.000000 17.000000 17.000000 6.000000 0.171678 0.149298 0.003808 0.225529 0.326046 0.003578 0.407688 0.000000 0.000000 2.980545 0.571199 8.316395 5.043612 3.433534 2.667086 7.061097 5.071480 1.549017 0.564069 0.000000 0.000000 3.967464 3.380103 4.357901 3.421569 5.875767 5.236962 7.377035 6.452613 0.737249 7.231158e-01 0.783062 0.673962 0.014409 0.786658 0.677075 0.103365 0.078300 0.215297 0.154157 0.038549 0.012740 0.000000 0.000000 0.117827 0.099950 0.131210 0.099230 0.171809 0.155568 0.225458 0.191146 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 1.000000 1.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 NaN NaN NaN NaN NaN 0.000000 4.000000 0.000000 10.000000 0.000000 0.000000 0.000000 0.0 0.0
max 5706.000000 2.018122e+09 99.000000 900.000000 1800.000000 3600.000000 1.000000 38.000000 1.000000 5.000000 4.000000 1.000000 50.000000 99.000000 99.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 84.000000 90.000000 79.000000 3.000000 3.000000 1.000000 3.000000 3.000000 61.000000 58.000000 61.000000 61.000000 59.000000 61.000000 61.000000 59.000000 1.000000 0.360177 0.031461 0.496874 0.994605 0.015177 0.912963 0.993128 0.473500 6.500900 9.508015 68.932906 51.355447 32.094625 31.000804 56.430545 46.814602 8.150204 9.515020 6.506550 9.515020 31.967276 31.016466 35.486008 32.399307 64.050944 67.710484 84.082169 70.993222 1.000000 1.000000e+00 1.000000 1.000000 0.994848 1.000000 1.000000 1.728627 1.597957 1.808344 1.522974 0.994848 0.999942 0.994848 0.999942 2.087053 1.878256 2.389044 2.360388 3.206215 4.300814 4.085693 4.244676 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 1.000000 NaN NaN NaN NaN NaN 93.000000 77.000000 109.000000 66.000000 1.000000 1.000000 1.000000 0.0 0.0

Data Cleaning¶

Many fields in the data represent the probability of a given action occurring (such as "fg_prob", which represents the probability the team will kick a field goal on that given play). While these advanced metrics are interesting, they are not necessary for this analysis and can be removed. Removing them leaves 168 of the original 255 columns. Three teams also changed their abbreviation during the time span of the data. Because each is the same team under a different abbreviation, the old abbreviations were replaced with the new ones.

Delete Unneeded Columns¶

Given the large number of variables, many can be discarded, specifically the probability-based statistical columns computed for each play and the player ID columns.

In [9]:
col_del_list = [
  "no_score_prob","opp_fg_prob","opp_safety_prob","opp_td_prob","fg_prob",
  "safety_prob","td_prob","extra_point_prob","two_point_conversion_prob",
  "ep","epa","total_home_epa","total_away_epa","total_home_rush_epa",
  "total_away_rush_epa","total_home_pass_epa","air_epa","yac_epa","comp_air_epa",
  "comp_yac_epa","total_home_comp_air_epa","total_away_comp_air_epa",
  "total_home_comp_yac_epa","total_away_comp_yac_epa","total_home_raw_air_epa",
  "total_away_raw_air_epa","total_home_raw_yac_epa","total_away_raw_yac_epa", 
  "wp", "def_wp","home_wp","away_wp","wpa","home_wp_post","away_wp_post",
  "total_home_rush_wpa","total_away_rush_wpa","total_home_pass_wpa",
  "total_away_pass_wpa","air_wpa","yac_wpa","comp_air_wpa","comp_yac_wpa",
  "total_home_comp_air_wpa","total_away_comp_air_wpa","total_home_comp_yac_wpa",
  "total_home_raw_air_wpa","total_away_raw_air_wpa","total_home_raw_yac_wpa",
  "total_away_raw_yac_wpa","passer_player_id","receiver_player_id",
  "rusher_player_id","lateral_receiver_player_id","lateral_rusher_player_id",
  "lateral_sack_player_id","interception_player_id",
  "lateral_interception_player_id","punt_returner_player_id",
  "lateral_punt_returner_player_id","kickoff_returner_player_id",
  "lateral_kickoff_returner_player_id","punter_player_id","kicker_player_id",
  "own_kickoff_recovery_player_id","blocked_player_id",
  "tackle_for_loss_1_player_id","tackle_for_loss_2_player_id",
  "qb_hit_1_player_id","qb_hit_2_player_id","forced_fumble_player_1_player_id",
  "forced_fumble_player_2_player_id",
  "solo_tackle_1_player_id","solo_tackle_2_player_id","assist_tackle_1_player_id",
  "assist_tackle_2_player_id","assist_tackle_3_player_id",
  "assist_tackle_4_player_id","pass_defense_1_player_id","pass_defense_2_player_id",
  "fumbled_1_player_id","fumbled_2_player_id","fumble_recovery_1_team",
  "fumble_recovery_1_player_id","fumble_recovery_2_player_id","penalty_player_id",
  "total_away_pass_epa"
]

# Drop all unneeded columns in a single call
NFL.drop(columns=col_del_list, inplace=True)
In [10]:
len(col_del_list)
Out[10]:
87
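As a cross-check on a hand-written list like the one above, columns of this kind can also be selected by name pattern. The sketch below runs on a small toy frame, not the real data. The suffix tuple and the names `toy`, `suffixes`, and `to_drop` are assumptions made for illustration, and columns such as "ep" or "wp" do not match any suffix and would still need to be listed explicitly.

```python
import pandas as pd

# Toy frame standing in for the play-by-play data (column names borrowed from the list above)
toy = pd.DataFrame({
    "play_id": [46, 68],
    "fg_prob": [0.1, 0.2],
    "total_home_epa": [0.0, 0.5],
    "air_wpa": [0.0, 0.01],
    "passer_player_id": ["a", "b"],
    "penalty": [0.0, 1.0],
})

# Columns ending in one of these suffixes are treated as droppable (an assumption for this sketch)
suffixes = ("_prob", "_epa", "_wpa", "_player_id")
to_drop = [c for c in toy.columns if c.endswith(suffixes)]

# errors="ignore" keeps the call safe to re-run after the columns are already gone
trimmed = toy.drop(columns=to_drop, errors="ignore")
remaining_columns = trimmed.columns.tolist()
print(remaining_columns)
```

Comparing `to_drop` against the manual list is one way to spot a column that was missed or included by accident.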

Team Replacement¶

There are 32 NFL teams, yet the data, which spans 2009 to 2018, shows 35 unique team abbreviations. During that time frame, three teams changed their abbreviation. The Chargers moved from San Diego, CA to Los Angeles, CA in 2017, so their abbreviation changed from SD to LAC. In 2013, a social media campaign and fan petition saw the Jaguars change their abbreviation from JAC to JAX. Finally, in 2016 the Rams moved from St. Louis, MO to Los Angeles, CA, and their abbreviation changed from STL to LA. The first step in cleaning the data is to convert the old abbreviations to the new ones so that each team's performance can be analyzed as a whole.

In [11]:
NFL.replace("JAC","JAX",inplace= True)
NFL.replace("SD","LAC",inplace= True)
NFL.replace("STL","LA",inplace= True)
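A quick way to confirm that a replacement like this worked is to check that none of the old abbreviations remain. The following is a minimal sketch on a toy frame with only two team columns. The names `toy`, `abbr_map`, and `old_left` are hypothetical, and the mapping is passed as a single dictionary rather than three separate calls.

```python
import pandas as pd

# Toy frame with a mix of old and new abbreviations
toy = pd.DataFrame({
    "posteam": ["SD", "JAC", "STL", "LAC", "PIT"],
    "defteam": ["JAX", "LA", "SD", "TEN", "STL"],
})

# Old abbreviation -> new abbreviation
abbr_map = {"JAC": "JAX", "SD": "LAC", "STL": "LA"}

# A flat dict replaces whole cell values that match a key, in every column
cleaned = toy.replace(abbr_map)

# Any old abbreviation still present after cleaning would show up here
old_left = set(abbr_map) & set(cleaned["posteam"]).union(cleaned["defteam"])
print(old_left)
```

On the real data, the same idea could be applied by checking that `NFL['posteam'].nunique()` drops from 35 to 32.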

Adding Columns¶

In [12]:
# Extract the year
#NFL['game_date'] = pd.to_datetime(NFL['game_date']).dt.date
NFL['game_year'] = pd.to_datetime(NFL['game_date']).dt.year

# Determine the division and conference of the offense (posteam) and defense (defteam)
NFL['posteam_division'] = NFL['posteam'].apply(team_division)
NFL['defteam_division']= NFL['defteam'].apply(team_division)

NFL['posteam_confrence'] = NFL['posteam'].apply(team_confrence)
NFL['defteam_confrence']= NFL['defteam'].apply(team_confrence)

NFL['penalty_team_division'] = NFL['penalty_team'].apply(team_division)
NFL['penalty_team_confrence'] = NFL['penalty_team'].apply(team_confrence)

NFL['penalty_side'] = NFL.apply(lambda x: penalty_type(x.penalty_team, x.posteam), axis=1)
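Because `penalty_team` is missing on plays without a penalty, `team_division` and `team_confrence` return "Error: Team Not Found" for those rows. One possible alternative, sketched here under the assumption that a missing value is preferable to an error string, is a dictionary lookup with `Series.map`, which leaves unknown or missing teams as NaN. `DIVISIONS` and `TEAM_TO_DIVISION` are hypothetical names, and the sample series is made up.

```python
import numpy as np
import pandas as pd

# Division membership, using the same abbreviations as team_division()
DIVISIONS = {
    "NFC West": ["SF", "SEA", "ARI", "LA"],
    "NFC South": ["NO", "TB", "CAR", "ATL"],
    "NFC East": ["WAS", "NYG", "DAL", "PHI"],
    "NFC North": ["GB", "CHI", "MIN", "DET"],
    "AFC West": ["KC", "OAK", "LAC", "DEN"],
    "AFC South": ["TEN", "IND", "HOU", "JAX"],
    "AFC East": ["BUF", "MIA", "NE", "NYJ"],
    "AFC North": ["PIT", "BAL", "CLE", "CIN"],
}

# Invert to a team -> division lookup
TEAM_TO_DIVISION = {team: div for div, teams in DIVISIONS.items() for team in teams}

# Sample values: a real team, a missing value, another real team
teams = pd.Series(["PIT", np.nan, "LA"])

# map() returns NaN for anything not in the lookup, including NaN itself
divisions = teams.map(TEAM_TO_DIVISION)
print(divisions)
```

The conference can be derived from the division label by taking its first three characters, for example `divisions.str[:3]`.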

Scrape NFL.com for team win/loss¶

In [13]:
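def scrape_records():
    """Scrape each team's regular-season W/L/T record for 2009-2018 from NFL.com"""
    frames = []
    for year in range(2009,2019):
        url=f'https://www.nfl.com/standings/division/{year}/REG'
        for sdf in pd.read_html(url):
            # Keep team name, wins, losses, ties; copy so new columns can be added safely
            sdf = sdf.iloc[:,:4].copy()
            sdf.columns = ['team','W','L','T']
            sdf['season'] = f'season_{year}'
            frames.append(sdf)
    # DataFrame.append was removed in pandas 2.0; concatenate all tables once instead
    odf = pd.concat(frames)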
    def get_team_abbr(name):
        team_dict = {
            'browns':'CLE','bucc':'TB','bills':'BUF','titans':'TEN','dolphins':'MIA',
            'jaguars':'JAX','jets':'NYJ','washington':'WAS','bears':'CHI', 'giants':'NYG',
            'cardinals':'ARI','49ers':'SF','rams':'LA','lions':'DET','seahawks':'SEA',
            'panthers':'CAR','cowboys':'DAL','chargers':'LAC','vikings':'MIN','falcons':'ATL',
            'texans':'HOU','broncos':'DEN','eagles':'PHI','patriots':'NE','steelers':'PIT',
            'saints':'NO','ravens':'BAL','bengals':'CIN','colts':'IND','chiefs':'KC',
            'packers':'GB','raiders':'OAK'
        }
        for key in team_dict.keys():
            if key in name.lower(): return team_dict[key]
        return 'NOT FOUND'
    odf['team']=odf.team.apply(get_team_abbr)
    return odf.reset_index(drop=True)
records = scrape_records()
records['record'] = records['W'].astype(str)+"-"+records['L'].astype(str)+"-"+records['T'].astype(str)
In [14]:
def get_season(date):
    if type(date)==str: date = pd.to_datetime(date)
    year = date.year
    if date.month<5: year = year-1
    return f'season_{year}'
NFL['season'] = pd.to_datetime(NFL['game_date']).apply(get_season)

NFL.head()
Out[14]:
play_id game_id home_team away_team posteam posteam_type defteam side_of_field yardline_100 game_date quarter_seconds_remaining half_seconds_remaining game_seconds_remaining game_half quarter_end drive sp qtr down goal_to_go time yrdln ydstogo ydsnet desc play_type yards_gained shotgun no_huddle qb_dropback qb_kneel qb_spike qb_scramble pass_length pass_location air_yards yards_after_catch run_location run_gap field_goal_result kick_distance extra_point_result two_point_conv_result home_timeouts_remaining away_timeouts_remaining timeout timeout_team td_team posteam_timeouts_remaining defteam_timeouts_remaining total_home_score total_away_score posteam_score defteam_score score_differential posteam_score_post defteam_score_post score_differential_post total_away_comp_yac_wpa punt_blocked first_down_rush first_down_pass first_down_penalty third_down_converted third_down_failed fourth_down_converted fourth_down_failed incomplete_pass interception punt_inside_twenty punt_in_endzone punt_out_of_bounds punt_downed punt_fair_catch kickoff_inside_twenty kickoff_in_endzone kickoff_out_of_bounds kickoff_downed kickoff_fair_catch fumble_forced fumble_not_forced fumble_out_of_bounds solo_tackle safety penalty tackled_for_loss fumble_lost own_kickoff_recovery own_kickoff_recovery_td qb_hit rush_attempt pass_attempt sack touchdown pass_touchdown rush_touchdown return_touchdown extra_point_attempt two_point_attempt field_goal_attempt kickoff_attempt punt_attempt fumble complete_pass assist_tackle lateral_reception lateral_rush lateral_return lateral_recovery passer_player_name receiver_player_name rusher_player_name lateral_receiver_player_name lateral_rusher_player_name lateral_sack_player_name interception_player_name lateral_interception_player_name punt_returner_player_name lateral_punt_returner_player_name kickoff_returner_player_name lateral_kickoff_returner_player_name punter_player_name kicker_player_name own_kickoff_recovery_player_name blocked_player_name 
tackle_for_loss_1_player_name tackle_for_loss_2_player_name qb_hit_1_player_name qb_hit_2_player_name forced_fumble_player_1_team forced_fumble_player_1_player_name forced_fumble_player_2_team forced_fumble_player_2_player_name solo_tackle_1_team solo_tackle_2_team solo_tackle_1_player_name solo_tackle_2_player_name assist_tackle_1_player_name assist_tackle_1_team assist_tackle_2_player_name assist_tackle_2_team assist_tackle_3_player_name assist_tackle_3_team assist_tackle_4_player_name assist_tackle_4_team pass_defense_1_player_name pass_defense_2_player_name fumbled_1_team fumbled_1_player_name fumbled_2_player_name fumbled_2_team fumble_recovery_1_yards fumble_recovery_1_player_name fumble_recovery_2_team fumble_recovery_2_yards fumble_recovery_2_player_name return_team return_yards penalty_team penalty_player_name penalty_yards replay_or_challenge replay_or_challenge_result penalty_type defensive_two_point_attempt defensive_two_point_conv defensive_extra_point_attempt defensive_extra_point_conv game_year posteam_division defteam_division posteam_confrence defteam_confrence penalty_team_division penalty_team_confrence penalty_side season
0 46 2009091000 PIT TEN PIT home TEN TEN 30.0 2009-09-10 900.0 1800.0 3600.0 Half1 0 1 0 1 NaN 0.0 15:00 TEN 30 0 0 R.Bironas kicks 67 yards from TEN 30 to PIT 3.... kickoff 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 67.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 NaN NaN NaN 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN S.Logan NaN NaN R.Bironas NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN M.Griffin NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN PIT 39.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
1 68 2009091000 PIT TEN PIT home TEN PIT 58.0 2009-09-10 893.0 1793.0 3593.0 Half1 0 1 0 1 1.0 0.0 14:53 PIT 42 10 5 (14:53) B.Roethlisberger pass short left to H.... pass 5.0 0 0 1.0 0 0 0 short left -3.0 8.0 NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 B.Roethlisberger H.Ward NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN C.Hope NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
2 92 2009091000 PIT TEN PIT home TEN PIT 53.0 2009-09-10 856.0 1756.0 3556.0 Half1 0 1 0 1 2.0 0.0 14:16 PIT 47 5 2 (14:16) W.Parker right end to PIT 44 for -3 ya... run -3.0 0 0 0.0 0 0 0 NaN NaN NaN NaN right end NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN W.Parker NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN S.Tulloch NaN NaN NaN NaN NaN NaN NaN TEN NaN S.Tulloch NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
3 113 2009091000 PIT TEN PIT home TEN PIT 56.0 2009-09-10 815.0 1715.0 3515.0 Half1 0 1 0 1 3.0 0.0 13:35 PIT 44 8 2 (13:35) (Shotgun) B.Roethlisberger pass incomp... pass 0.0 1 0 1.0 0 0 0 deep right 34.0 NaN NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 B.Roethlisberger M.Wallace NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
4 139 2009091000 PIT TEN PIT home TEN PIT 56.0 2009-09-10 807.0 1707.0 3507.0 Half1 0 1 0 1 4.0 0.0 13:27 PIT 44 8 2 (13:27) (Punt formation) D.Sepulveda punts 54 ... punt 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 54.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN D.Sepulveda NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
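The season label follows the rule that games played before May belong to the season that started the previous calendar year, which matters for January games. The sketch below restates that rule in a standalone function so it can be checked on a few dates. `season_label` is a hypothetical name and the dates are arbitrary examples, not rows taken from the data.

```python
import pandas as pd

def season_label(date):
    """Same rule as get_season: a game before May counts toward the previous year's season"""
    date = pd.to_datetime(date)
    year = date.year - 1 if date.month < 5 else date.year
    return f"season_{year}"

# One September game, one January game, one late-December game
labels = [season_label(d) for d in ["2009-09-10", "2010-01-03", "2018-12-30"]]
print(labels)
```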

Tim: What are the effects of penalties on team performance?¶

Team Color Codes¶

In [15]:
# colors from https://teamcolorcodes.com/nfl-team-color-codes/
team_colors = {
    "NFL":['#003069','#d60303','#005d00'],
    "AFC": ['#003069','#d60303','#005d00'],
    "NFC": ['#003069','#d60303','#005d00'],
    'ARI': ['#97233F','#000000','#FFB612'],
     'ATL': ['#A71930','#000000','#A5ACAF'],
     'BAL': ['#241773','#000000','#9E7C0C'],
     'BUF': ['#00338D','#C60C30','gray'],
     'CAR': ['#0085CA','#101820','#BFC0BF'],
     'CHI': ['#0B162A','#C83803','gray'],
     'CIN': ['#FB4F14','#000000','gray'],
     'CLE': ['#311D00','#FF3C00','gray'],
     'DAL': ['#003594','#869397','black'],
     'DEN': ['#FB4F14','#002244','gray'],
     'DET': ['#0076B6','#B0B7BC','#000000'],
     'GB': ['#203731','#FFB612','gray'],
     'HOU': ['#03202F','#A71930','gray'],
     'IND': ['#002C5F','#A2AAAD','gray'],
     'JAX': ['#101820','#D7A22A','#006778'],
     'KC': ['#E31837','#FFB81C','#000000'],
     'LA': ['#003594','#FFD100','#FFA300'],
     'LAC': ['#0080C6','#FFC20E','gray'],
     'MIA': ['#008E97','#FC4C02','#005778'],
     'MIN': ['#4F2683','#FFC62F','gray'],
     'NE': ['#002244','#C60C30','#B0B7BC'],
     'NO': ['#D3BC8D','#101820','gray'],
     'NYG': ['#0B2265','#A71930','#A5ACAF'],
     'NYJ': ['#125740','#000000','#046A38'],
     'OAK': ['#000000','#A5ACAF','gray'],
     'PHI': ['#004C54','#004C54','#565A5C'],
     'PIT': ['#FFB612','#101820','#A5ACAF'],
     'SEA': ['#002244','#69BE28','#A5ACAF'],
     'SF': ['#AA0000','#B3995D','#000000'],
     'TB': ['#D50A0A','#B1BABF','#0A0A08'],
     'TEN': ['#0C2340','#4B92DB','#C8102E'],
     'WAS': ['#773141','#FFB612','gray']
}

Further slice the data for penalty-related columns¶

In [16]:
penalty_slice = NFL.loc[NFL.penalty == 1.0]
penalty_df = penalty_slice[['penalty','penalty_team','penalty_player_name','penalty_yards','penalty_type',
                            'penalty_team_division','penalty_team_confrence', 'penalty_side',
                            'posteam','defteam',
                            'yardline_100','game_half','down','ydstogo','play_type',
                            'shotgun','no_huddle','qb_dropback','qb_scramble','pass_length','pass_location',
                            'air_yards','yards_after_catch','run_location','run_gap',
                            'rush_attempt','pass_attempt','sack','touchdown','posteam_division','defteam_division',
                            'posteam_confrence','defteam_confrence','season']]
penalty_df.head()
Out[16]:
penalty penalty_team penalty_player_name penalty_yards penalty_type penalty_team_division penalty_team_confrence penalty_side posteam defteam yardline_100 game_half down ydstogo play_type shotgun no_huddle qb_dropback qb_scramble pass_length pass_location air_yards yards_after_catch run_location run_gap rush_attempt pass_attempt sack touchdown posteam_division defteam_division posteam_confrence defteam_confrence season
15 1.0 PIT T.Polamalu 15.0 Unnecessary Roughness AFC North AFC Defensive TEN PIT 89.0 Half1 1.0 10 run 0 0 0.0 0 NaN NaN NaN NaN right end 1.0 0.0 0.0 0.0 AFC South AFC North AFC AFC season_2009
26 1.0 TEN D.Stewart 5.0 Illegal Formation AFC South AFC Offensive TEN PIT 64.0 Half1 1.0 10 no_play 0 0 0.0 0 NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 0.0 AFC South AFC North AFC AFC season_2009
46 1.0 PIT H.Ward 10.0 Offensive Holding AFC North AFC Offensive PIT TEN 70.0 Half1 2.0 1 run 0 0 0.0 0 NaN NaN NaN NaN right guard 1.0 0.0 0.0 0.0 AFC North AFC South AFC AFC season_2009
47 1.0 PIT M.Starks 10.0 Offensive Holding AFC North AFC Offensive PIT TEN 75.0 Half1 2.0 6 no_play 0 0 0.0 0 NaN NaN NaN NaN NaN NaN 0.0 0.0 0.0 0.0 AFC North AFC South AFC AFC season_2009
61 1.0 PIT T.Polamalu 15.0 NaN AFC North AFC Defensive TEN PIT 40.0 Half1 3.0 7 pass 1 0 1.0 0 short left 5.0 3.0 NaN NaN 0.0 1.0 0.0 0.0 AFC South AFC North AFC AFC season_2009

High-Level View of NFL Penalties¶

Penalties in the NFL¶

In [17]:
# Remove spaces so each penalty name is a single token; drop missing values
text = " ".join(penalty_df.penalty_type.dropna().str.replace(" ", ""))
wordcloud = WordCloud(background_color="white", width=800, height=400, colormap='inferno_r').generate(text)
# Create the figure first so the axis and title settings apply to it
plt.figure(figsize=(40,20))
plt.imshow(wordcloud, interpolation='bilinear')
plt.axis("off")
plt.title("Total NFL Penalties 2009 - 2018")
plt.tight_layout(pad=0)
plt.show()

Average Defensive Pass Interference Yards¶

In [18]:
defensivePI = penalty_df[penalty_df['penalty_type'] == 'Defensive Pass Interference']
defensivePI.penalty_yards.mean()
Out[18]:
17.499325842696628

Top 5 Penalties Committed by Count¶

In [19]:
top5pen = penalty_df['penalty_type'].value_counts().head().to_frame().reset_index()
top5pen.columns = ['Penalty','Count']
top5pen
Out[19]:
Penalty Count
0 Offensive Holding 6250
1 False Start 5793
2 Defensive Pass Interference 2225
3 Unnecessary Roughness 1794
4 Defensive Holding 1731

Top 5 Penalties Committed by Yards¶

In [20]:
top5penyds = penalty_df.groupby(['penalty_type']).penalty_yards.sum().sort_values(ascending=False).head().to_frame().reset_index()
top5penyds.columns = ['Penalty','Total Yards']
top5penyds
Out[20]:
Penalty Total Yards
0 Offensive Holding 59788.0
1 Defensive Pass Interference 38936.0
2 False Start 28469.0
3 Unnecessary Roughness 24254.0
4 Roughing the Passer 12227.0
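Dividing total yards by the number of flags gives a rough yards-per-flag figure for the penalties that appear in both tables. The sketch below copies the counts from Out[19] and the totals from Out[20]. It assumes every counted flag also has a yardage value, and the variable names are hypothetical.

```python
import pandas as pd

# Counts copied from Out[19]
counts = pd.Series({
    "Offensive Holding": 6250,
    "False Start": 5793,
    "Defensive Pass Interference": 2225,
    "Unnecessary Roughness": 1794,
    "Defensive Holding": 1731,
})

# Yard totals copied from Out[20]
yards = pd.Series({
    "Offensive Holding": 59788.0,
    "Defensive Pass Interference": 38936.0,
    "False Start": 28469.0,
    "Unnecessary Roughness": 24254.0,
    "Roughing the Passer": 12227.0,
})

# Division aligns on the index; penalties present in only one table become NaN and are dropped
yards_per_flag = (yards / counts).dropna().sort_values(ascending=False)
print(yards_per_flag.round(2))
```

The Defensive Pass Interference value computed this way agrees with the mean reported in Out[18].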
In [21]:
top5penyds.Penalty
Out[21]:
0              Offensive Holding
1    Defensive Pass Interference
2                    False Start
3          Unnecessary Roughness
4            Roughing the Passer
Name: Penalty, dtype: object
In [22]:
top5penyds['Total Yards'].sum()
Out[22]:
163674.0
In [23]:
penalty_df['penalty_yards'].sum()
Out[23]:
276856.0
In [24]:
round(top5penyds['Total Yards'].sum() / penalty_df['penalty_yards'].sum(),4)
Out[24]:
0.5912
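The same ratio can be broken down per penalty to see how much each of the top five contributes to all penalty yards. This is a small sketch that reuses the totals printed in Out[20] and Out[23] rather than the full data. `top5_yards`, `all_penalty_yards`, and `share` are hypothetical names.

```python
import pandas as pd

# Yard totals copied from Out[20]
top5_yards = pd.Series({
    "Offensive Holding": 59788.0,
    "Defensive Pass Interference": 38936.0,
    "False Start": 28469.0,
    "Unnecessary Roughness": 24254.0,
    "Roughing the Passer": 12227.0,
})

# Total penalty yards copied from Out[23]
all_penalty_yards = 276856.0

# Share of all penalty yards attributable to each of the top five penalties
share = top5_yards / all_penalty_yards

# Combined share of the top five, matching the calculation in In [24]
top5_share = round(top5_yards.sum() / all_penalty_yards, 4)
print(share.round(4))
print(top5_share)
```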

NFL Penalties by Side, Year over Year¶

In [25]:
plt.figure(figsize=(15,10))
sns.barplot(x="season", y="penalty", hue="penalty_side", palette = team_colors['NFL'], saturation = 0.7, data=penalty_df.groupby(["penalty_side",'season'], as_index=False)["penalty"].sum())
plt.xticks(rotation = 90)
plt.title("Penalties By Side Per Year")
plt.show()
In [26]:
# Count penalties per game half (Half1, Half2, Overtime)
half_counts = penalty_df['game_half'].value_counts(sort=False)
size = half_counts.tolist()

# Label each wedge with its share of all penalties, formatted as a percentage
labels = [f"{name}: {count / half_counts.sum():.2%}" for name, count in half_counts.items()]

# Create a circle at the center of the plot
my_circle = plt.Circle( (0,0), 0.6, color='white')

plt.figure(figsize=(10,15))

# Custom wedges
plt.pie(size, labels=labels, colors = team_colors['NFL'])
p = plt.gcf()
p.gca().add_artist(my_circle)
plt.title("NFL Penalties by Game Half 2009 - 2018")
plt.show()
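The share of penalties in each game half can also be read directly from `value_counts(normalize=True)`, with the `.2%` format specifier handling the percentage rounding. The sketch below uses a made-up four-row series purely to show the mechanics. It is not drawn from the penalty data.

```python
import pandas as pd

# Made-up game halves for four penalties (illustrative only)
halves = pd.Series(["Half1", "Half1", "Half2", "Overtime"])

# normalize=True returns each category's fraction of the total instead of its count
shares = halves.value_counts(normalize=True)

# Build one label per wedge, e.g. "Half1: 50.00%"
labels = [f"{half}: {share:.2%}" for half, share in shares.items()]
print(labels)
```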

Offensive Penalties: Select Team¶

In [27]:
def offensive_penalty_team(team):
    """Produce a word cloud of offensive penalty frequency for a user-provided team abbreviation"""

    team_off_pen_df = penalty_df[(penalty_df['penalty_team'] == team) & (penalty_df['posteam'] == team)]

    # Remove spaces so each penalty name is a single token; drop missing values
    text = " ".join(team_off_pen_df.penalty_type.dropna().str.replace(" ", ""))
    wordcloud = WordCloud(background_color="white", width=800, height=400, colormap='inferno_r').generate(text)
    # Create the figure first so the axis and title settings apply to it
    plt.figure(figsize=(40,20))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis("off")
    plt.title(f"Total Offensive {team} Penalties 2009 - 2018")
    plt.tight_layout(pad=0)
    plt.show()
offensive_penalty_team("SF")

Defensive Penalties: Select Team¶

In [28]:
def defensive_penalty_team(team):
    """Produce a word cloud of defensive penalty frequency for a user-provided team abbreviation"""
    team_def_pen_df = penalty_df[(penalty_df['penalty_team'] == team) & (penalty_df['defteam'] == team)]

    # Remove spaces so each penalty name is a single token; drop missing values
    text = " ".join(team_def_pen_df.penalty_type.dropna().str.replace(" ", ""))
    wordcloud = WordCloud(background_color="white", width=800, height=400, colormap='inferno_r').generate(text)
    # Create the figure first so the axis and title settings apply to it
    plt.figure(figsize=(40,20))
    plt.imshow(wordcloud, interpolation='bilinear')
    plt.axis("off")
    plt.title(f"Total Defensive {team} Penalties 2009 - 2018")
    plt.tight_layout(pad=0)
    plt.show()
defensive_penalty_team("SF")
In [29]:
team_penalty_yards = penalty_df.groupby(['penalty_team','season','penalty_type','penalty_side'], as_index=False)['penalty_yards'].mean()
division_penalty_yards = penalty_df.groupby(['penalty_team_division','season','penalty_type','penalty_side'], as_index=False)['penalty_yards'].mean()
confrence_penalty_yards = penalty_df.groupby(['penalty_team_confrence','season','penalty_type','penalty_side'], as_index=False)['penalty_yards'].mean()
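Because these tables are grouped by `penalty_type`, each row is already a per-type mean. A bar drawn from several such rows averages those means, which weights a rare penalty the same as a common one. The sketch below illustrates the difference on made-up numbers. The frame and variable names are hypothetical and the yardage values are not taken from the data.

```python
import pandas as pd

# Made-up flags for one team-season: three short penalties and one long one
toy = pd.DataFrame({
    "penalty_type": ["False Start", "False Start", "False Start", "Defensive Pass Interference"],
    "penalty_yards": [5.0, 5.0, 5.0, 25.0],
})

# Mean yards per penalty type, as in the groupby above
per_type_mean = toy.groupby("penalty_type")["penalty_yards"].mean()

# Averaging the per-type means treats each penalty type equally
mean_of_means = per_type_mean.mean()

# Averaging over individual flags weights each type by how often it is called
per_flag_mean = toy["penalty_yards"].mean()

print(mean_of_means, per_flag_mean)
```

Which figure is more appropriate depends on the question being asked. The sketch only shows that the two can differ.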
In [30]:
def avg_penalty_yards(team):
    """
    Aggregate the Total, Offensive, and Defensive penalty yards of a team and plot the
    team's average penalty yards in a barplot alongside the averages for the team's
    conference and division. The season record is displayed on each chart, and the
    plot is colored with the team's colors.
    """
    off_team = records.loc[records['team'] == team]
    
    record_list = off_team['record'].tolist()
    
    def _slice(df, key_col, key, side=None):
        """Filter one aggregate table to `key` (optionally one penalty side) and label its group column 'Avg'"""
        mask = df[key_col] == key
        if side is not None:
            mask &= df['penalty_side'] == side
        # rename() without inplace returns a new frame, which avoids SettingWithCopyWarning
        return df.loc[mask].rename(columns={key_col: 'Avg'})

    def _stack(side=None):
        """Stack the team, division, and conference rows for plotting"""
        return pd.concat([
            _slice(team_penalty_yards, 'penalty_team', team, side),
            _slice(division_penalty_yards, 'penalty_team_division', team_division(team), side),
            _slice(confrence_penalty_yards, 'penalty_team_confrence', team_confrence(team), side),
        ], ignore_index=True)

    # Team-only rows, used later for the season tick labels
    total_1 = _slice(team_penalty_yards, 'penalty_team', team)

    # Total, Offensive, and Defensive data sets
    total_data = _stack()
    off_data = _stack("Offensive")
    def_data = _stack("Defensive")
    
    # Plots
    fig, axs = plt.subplots(nrows=3, figsize = (10,15))

    sns.barplot(x = 'season',
            y = 'penalty_yards',
            hue = 'Avg',
            data =  total_data,
            ci=None,
            palette = team_colors[team], ax=axs[0])
    sns.barplot(x = 'season',
            y = 'penalty_yards',
            hue = 'Avg',
            data =  off_data,
            ci=None,
            palette= team_colors[team], ax=axs[1])
    sns.barplot(x = 'season',
            y = 'penalty_yards',
            hue = 'Avg',
            data =  def_data,
            ci=None,
            palette= team_colors[team], ax=axs[2])
    

    
    titles = ['Total Avg Penalty Yards','Offensive Avg Penalty Yards','Defensive Avg Penalty Yards']
    for chart in range(0,3):
        axs[chart].set_xticklabels(labels = total_1.season.unique().tolist(), rotation = 90)
        axs[chart].legend(bbox_to_anchor=(1.01, 1), loc=2, borderaxespad=0)
        axs[chart].title.set_text(titles[chart])
        axs[chart].set_ylim([0,12])
        axs[chart].set_xlabel("Season")
        axs[chart].set_ylabel("Avg Yards")

        # Write the team's season record above each season's group of bars
        for position, record in enumerate(record_list):
            axs[chart].text(position-0.31, 10.7, record)
            
        
            
       
    plt.tight_layout()
    plt.show()

Test function and analyze historically low and high flagged teams¶

In [31]:
avg_penalty_yards("IND")
In [32]:
avg_penalty_yards("NYJ")
In [33]:
avg_penalty_yards("OAK")

Analyze Game Half Penalties for Teams¶

In [34]:
def team_penalties_game_half(team):
    team_penalty_df = penalty_df[penalty_df['penalty_team'] == team]

    # Count this team's penalties per game half; a half with no penalties is simply absent
    half_counts = team_penalty_df['game_half'].value_counts(sort=False)
    size = half_counts.tolist()

    # Label each wedge with its share of the team's penalties, formatted as a percentage
    labels = [f"{name}: {count / half_counts.sum():.2%}" for name, count in half_counts.items()]

    # Create a circle at the center of the plot
    my_circle = plt.Circle( (0,0), 0.6, color='white')

    plt.figure(figsize=(10,15))

    # Custom wedges
    plt.pie(size, labels=labels, colors = team_colors[team])
    p = plt.gcf()
    p.gca().add_artist(my_circle)
    plt.title(f"{team} Penalties by Game Half 2009 - 2018")
    plt.show()
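
The wedge labels in this chart are each game half's share of a team's penalties. As a rough sketch, the same shares can be checked without plotting by using `value_counts(normalize=True)`. The toy frame and the helper name `game_half_shares` are hypothetical, introduced only for this example.

```python
import pandas as pd

# Toy stand-in for penalty_df (hypothetical values)
toy_penalties = pd.DataFrame({
    "penalty_team": ["IND", "IND", "IND", "IND", "NYJ", "NYJ"],
    "game_half": ["Half1", "Half1", "Half2", "Overtime", "Half1", "Half2"],
})

def game_half_shares(df, team):
    """Return the fraction of a team's penalties that fall in each game half."""
    team_rows = df[df["penalty_team"] == team]
    return team_rows["game_half"].value_counts(normalize=True)

shares = game_half_shares(toy_penalties, "IND")

# Format each share as a percentage label, as on the donut chart
labels = [f"{half}: {share:.2%}" for half, share in shares.items()]
```

This also works for a team with no overtime penalties, since it never assumes a fixed number of game halves.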
    
In [35]:
team_penalties_game_half("IND")
In [36]:
team_penalties_game_half("NYJ")
In [37]:
team_penalties_game_half("OAK")
In [38]:
NFL.head()
Out[38]:
play_id game_id home_team away_team posteam posteam_type defteam side_of_field yardline_100 game_date quarter_seconds_remaining half_seconds_remaining game_seconds_remaining game_half quarter_end drive sp qtr down goal_to_go time yrdln ydstogo ydsnet desc play_type yards_gained shotgun no_huddle qb_dropback qb_kneel qb_spike qb_scramble pass_length pass_location air_yards yards_after_catch run_location run_gap field_goal_result kick_distance extra_point_result two_point_conv_result home_timeouts_remaining away_timeouts_remaining timeout timeout_team td_team posteam_timeouts_remaining defteam_timeouts_remaining total_home_score total_away_score posteam_score defteam_score score_differential posteam_score_post defteam_score_post score_differential_post total_away_comp_yac_wpa punt_blocked first_down_rush first_down_pass first_down_penalty third_down_converted third_down_failed fourth_down_converted fourth_down_failed incomplete_pass interception punt_inside_twenty punt_in_endzone punt_out_of_bounds punt_downed punt_fair_catch kickoff_inside_twenty kickoff_in_endzone kickoff_out_of_bounds kickoff_downed kickoff_fair_catch fumble_forced fumble_not_forced fumble_out_of_bounds solo_tackle safety penalty tackled_for_loss fumble_lost own_kickoff_recovery own_kickoff_recovery_td qb_hit rush_attempt pass_attempt sack touchdown pass_touchdown rush_touchdown return_touchdown extra_point_attempt two_point_attempt field_goal_attempt kickoff_attempt punt_attempt fumble complete_pass assist_tackle lateral_reception lateral_rush lateral_return lateral_recovery passer_player_name receiver_player_name rusher_player_name lateral_receiver_player_name lateral_rusher_player_name lateral_sack_player_name interception_player_name lateral_interception_player_name punt_returner_player_name lateral_punt_returner_player_name kickoff_returner_player_name lateral_kickoff_returner_player_name punter_player_name kicker_player_name own_kickoff_recovery_player_name blocked_player_name 
tackle_for_loss_1_player_name tackle_for_loss_2_player_name qb_hit_1_player_name qb_hit_2_player_name forced_fumble_player_1_team forced_fumble_player_1_player_name forced_fumble_player_2_team forced_fumble_player_2_player_name solo_tackle_1_team solo_tackle_2_team solo_tackle_1_player_name solo_tackle_2_player_name assist_tackle_1_player_name assist_tackle_1_team assist_tackle_2_player_name assist_tackle_2_team assist_tackle_3_player_name assist_tackle_3_team assist_tackle_4_player_name assist_tackle_4_team pass_defense_1_player_name pass_defense_2_player_name fumbled_1_team fumbled_1_player_name fumbled_2_player_name fumbled_2_team fumble_recovery_1_yards fumble_recovery_1_player_name fumble_recovery_2_team fumble_recovery_2_yards fumble_recovery_2_player_name return_team return_yards penalty_team penalty_player_name penalty_yards replay_or_challenge replay_or_challenge_result penalty_type defensive_two_point_attempt defensive_two_point_conv defensive_extra_point_attempt defensive_extra_point_conv game_year posteam_division defteam_division posteam_confrence defteam_confrence penalty_team_division penalty_team_confrence penalty_side season
0 46 2009091000 PIT TEN PIT home TEN TEN 30.0 2009-09-10 900.0 1800.0 3600.0 Half1 0 1 0 1 NaN 0.0 15:00 TEN 30 0 0 R.Bironas kicks 67 yards from TEN 30 to PIT 3.... kickoff 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 67.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 NaN NaN NaN 0.0 0.0 0.0 0.000000 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN S.Logan NaN NaN R.Bironas NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN M.Griffin NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN PIT 39.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
1 68 2009091000 PIT TEN PIT home TEN PIT 58.0 2009-09-10 893.0 1793.0 3593.0 Half1 0 1 0 1 1.0 0.0 14:53 PIT 42 10 5 (14:53) B.Roethlisberger pass short left to H.... pass 5.0 0 0 1.0 0 0 0 short left -3.0 8.0 NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 B.Roethlisberger H.Ward NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN TEN NaN C.Hope NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
2 92 2009091000 PIT TEN PIT home TEN PIT 53.0 2009-09-10 856.0 1756.0 3556.0 Half1 0 1 0 1 2.0 0.0 14:16 PIT 47 5 2 (14:16) W.Parker right end to PIT 44 for -3 ya... run -3.0 0 0 0.0 0 0 0 NaN NaN NaN NaN right end NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN W.Parker NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN S.Tulloch NaN NaN NaN NaN NaN NaN NaN TEN NaN S.Tulloch NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
3 113 2009091000 PIT TEN PIT home TEN PIT 56.0 2009-09-10 815.0 1715.0 3515.0 Half1 0 1 0 1 3.0 0.0 13:35 PIT 44 8 2 (13:35) (Shotgun) B.Roethlisberger pass incomp... pass 0.0 1 0 1.0 0 0 0 deep right 34.0 NaN NaN NaN NaN NaN NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 B.Roethlisberger M.Wallace NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009
4 139 2009091000 PIT TEN PIT home TEN PIT 56.0 2009-09-10 807.0 1707.0 3507.0 Half1 0 1 0 1 4.0 0.0 13:27 PIT 44 8 2 (13:27) (Punt formation) D.Sepulveda punts 54 ... punt 0.0 0 0 0.0 0 0 0 NaN NaN NaN NaN NaN NaN NaN 54.0 NaN NaN 3 3 0.0 NaN NaN 3.0 3.0 0 0 0.0 0.0 0.0 0.0 0.0 0.0 -0.033038 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 1.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN D.Sepulveda NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN 0.0 NaN NaN NaN 0 NaN NaN 0.0 0.0 0.0 0.0 2009 AFC North AFC South AFC AFC Error: Team Not Found Error: Team Not Found Defensive season_2009